1. Summary of the Report

We have developed a novel algorithm for tracking an object, such as the UAVs shown in Fig. 1,
in a sequence of images. The algorithm is motivated by the idea behind the particle filter and
the concept of feedback from control theory. We therefore provisionally refer to it as the
particle filter with feedback (PFF) algorithm for image tracking. We briefly describe the
algorithm and compare its performance with that of some existing algorithms. This comparison
indicates that the proposed tracking algorithm substantially outperforms the existing methods
in terms of tracking accuracy, robustness, and tracking speed.


2.  Existing  Video  Trackers 

The GVF snake tracker was developed by Ray et al. (2002). It captures the object to be tracked
by minimizing an energy function defined on the basis of internal energy, external energy,
shape, size, position, and sampling of the contour. Under most circumstances, the snake tracker
is able to track a rolling object successfully.

The Monte Carlo tracker was developed by Cui et al. (2006). Based on the object movement
information and the image intensity features, a specialized sample-weighting criterion is
tailored to rolling objects observed in vivo. As the noise intensity level increases, the
performance of the snake tracker degrades more than that of the Monte Carlo tracker. More
details on this comparison can be found in Cui et al. (2006).


Fig. 1. Typical UAV images that represent typical targets in our research (Frame 1, Frame 31,
and Frame 61).
3. The Proposed Tracking Algorithm

The development of the algorithm was motivated by the idea behind the particle filter and the
concept of feedback in control theory. We first predict the target position using the
movement information of the previous steps. Samples of particles are then generated around the
predicted position. Unlike in the Monte Carlo tracker, where samples are generated randomly,
here samples are generated by gridding an area around the predicted position. The number and
the density of the samples are adjusted based on the previous movement information. At each of
these sample points, radial edge detection is applied to determine whether the point is within
the target boundary. The weighted average of the positions of those sample points detected to
be within the target boundary is taken as the filtered position of the center of the target in
the current image frame. The weighting for a sample point is assigned according to a normal
distribution with respect to its distance from the predicted position.

Various  components  of  the  algorithm  are  described  in  more  detail  as  follows. 

3.1.  Sample  Generation 

Samples  are  generated  around  a  predicted  position  of  the  target.  We  predict  the  target  position 
using  the  movement  information  of  previous  steps.  In  the  current  stage,  we  still  use  the  motion 
model  of  Cui  et  al.  (2006).  The  target  position  is  predicted  by 

$$x_{c,t+1} = x_{c,t} + \alpha (x_{c,t} - x_{c,t-1}) + \beta (x_{c,t-1} - x_{c,t-2}) + (1 - \alpha - \beta)(x_{c,t-2} - x_{c,t-3}),$$

$$y_{c,t+1} = y_{c,t},$$

where $(x_{c,t+1}, y_{c,t+1})$ is the predicted target position in frame $t+1$, $(x_{c,t}, y_{c,t})$
is the estimated position in frame $t$, and $\alpha$ and $\beta$ are non-negative constants. In
the 2nd frame, when we do not have three previous frames, we use the filtered position of the
previous frame. Similarly, in the 3rd and 4th frames, we use only the information of the
previous 2 and 3 frames, respectively.
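The prediction step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `predict_position`, the default values of `alpha` and `beta`, and the rule of simply dropping the displacement terms whose frames are not yet available are ours and are not specified in the report.

```python
def predict_position(history, alpha=0.5, beta=0.3):
    """Predict the target center in frame t+1 from past estimated centers.

    history: list of (x, y) estimated centers, oldest first, newest last.
    alpha, beta: non-negative constants of the motion model. The defaults
    are placeholders, not values taken from the report.
    """
    if not history:
        raise ValueError("at least one estimated position is required")
    x_t, y_t = history[-1]
    # Coefficients of the three frame-to-frame displacement terms.
    coeffs = (alpha, beta, 1.0 - alpha - beta)
    x_pred = x_t
    for lag, c in enumerate(coeffs):
        newer = len(history) - 1 - lag
        older = newer - 1
        if older < 0:
            # Assumption: terms that need unavailable frames are dropped.
            break
        x_pred += c * (history[newer][0] - history[older][0])
    # The motion model keeps the y coordinate unchanged.
    return x_pred, y_t
```

With a single known position the function returns that position, which matches the treatment of the 2nd frame described above.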

We generate the samples by gridding within an ellipse that is centered at the predicted
position,

$$\frac{(x - x_{c,t+1})^2}{a} + \frac{(y - y_{c,t+1})^2}{b} \le 1,$$

where  A<a,b<l  .  The  number  and  density  of  samples  are  adjusted  by  the  previous 
measurement  information.  Shown  in  Fig.  2  is  an  illustration  of  a  set  of  samples. 


Fig.  2.  A  set  of  samples  generated  around  the  predicted  position. 
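A minimal sketch of the gridding step follows. It assumes that the sampling region is the set of points satisfying $(x - x_c)^2 / a + (y - y_c)^2 / b \le 1$ around the predicted center $(x_c, y_c)$ and that the grid is uniform with spacing `step`. The function name `grid_samples` and the parameter `step` are hypothetical, and the feedback rule that adjusts the number and density of samples is not modeled here.

```python
import math


def grid_samples(center, a, b, step):
    """Return grid points (x, y) that satisfy
    (x - xc)^2 / a + (y - yc)^2 / b <= 1 around center = (xc, yc)."""
    xc, yc = center
    # Largest grid index that can still fall inside the region on each axis.
    nx = int(math.floor(math.sqrt(a) / step))
    ny = int(math.floor(math.sqrt(b) / step))
    samples = []
    for i in range(-nx, nx + 1):
        for j in range(-ny, ny + 1):
            dx, dy = i * step, j * step
            if dx * dx / a + dy * dy / b <= 1.0:
                samples.append((xc + dx, yc + dy))
    return samples
```

A smaller `step` gives a denser set of samples over the same region, which is one simple way the sample density could be adjusted from frame to frame.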
3.2. Image Intensity Measurement

Suppose $(x, y)$ is the position of the target center. By performing radial edge detection
around $(x, y)$, we can detect the target boundary. To do so, construct several line segments
extending radially from $(x, y)$ with coordinates $(l_{\theta,x}(k), l_{\theta,y}(k))$ (see Fig. 3),

$$l_{\theta,x}(k) = x + r(k)\cos\theta,$$

$$l_{\theta,y}(k) = y + r(k)\sin\theta, \quad \theta = \frac{k_\theta}{N} 2\pi, \quad k_\theta = 0, 1, \ldots, N - 1,$$

$$r(k) = r_1 + \frac{k}{K}(r_2 - r_1), \quad r_1 < r_2, \quad k = 0, 1, 2, \ldots, K,$$

where $N$ is the number of line segments, $K + 1$ is the number of points on each line, $\theta$
specifies the orientation of the line segment, and $r_1$ and $r_2$ are pre-specified values
delimiting the length of the line segments.
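The construction of the radial line segments can be illustrated as follows. The sketch assumes that the orientations are equally spaced over a full turn, and the function and argument names (`radial_points`, `n_lines`, `k_max`) are our own labels for $N$ and $K$.

```python
import math


def radial_points(x, y, n_lines, r1, r2, k_max):
    """Return n_lines lists, each holding the k_max + 1 points of one
    radial segment that runs from radius r1 to radius r2 around (x, y)."""
    lines = []
    for k_theta in range(n_lines):
        # Orientation of this segment, equally spaced over 2*pi.
        theta = 2.0 * math.pi * k_theta / n_lines
        pts = []
        for k in range(k_max + 1):
            r = r1 + (k / k_max) * (r2 - r1)
            pts.append((x + r * math.cos(theta), y + r * math.sin(theta)))
        lines.append(pts)
    return lines
```

For example, `radial_points(0, 0, 8, 1, 5, 4)` gives 8 segments of 5 points each, the first of which runs along the positive x axis from radius 1 to radius 5.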

In Cui et al. (2006), the following one-dimensional edge detection operator is applied on
each line segment,

$$e_\theta(k) = \left| I'_\theta(k-2) + 2 I'_\theta(k-1) - 2 I'_\theta(k+1) - I'_\theta(k+2) \right|.$$

Here we use a modified edge detection operator on each line segment,

$$e_\theta(k) = I'_\theta(k-2) + 2 I'_\theta(k-1) - 2 I'_\theta(k+1) - I'_\theta(k+2),$$

where $I'_\theta(k)$ is the image intensity at point $(l_{\theta,x}(k), l_{\theta,y}(k))$ obtained by
bilinear interpolation. The corresponding coordinate, denoted as $(e_{\theta,x}, e_{\theta,y})$, with
the maximum $e_\theta(k)$ is the detected edge point for the orientation $\theta$ for a bright
target (or with the minimum $e_\theta(k)$ for a dark target). An example of the radial edge
detection ($N = 8$) is illustrated in Fig. 3.
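The signed edge operator and the bilinear interpolation it relies on can be sketched as below. Several details are assumptions on our part: the image is a list of rows indexed as `image[row][col]`, the sampled coordinates lie inside the image, and the operator is evaluated only at points that have two neighbors on each side. The names `bilinear` and `detect_edge` are hypothetical.

```python
import math


def bilinear(image, x, y):
    """Intensity at real-valued (x, y); image[row][col], x = col, y = row."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    # Clamp the upper neighbors so points on the last row/column work.
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def detect_edge(profile, bright=True):
    """Return (k, e) for the strongest signed edge response on one segment.

    profile[k] is the interpolated intensity at the k-th point, ordered
    from the inner end of the segment outward.
    """
    if len(profile) < 5:
        raise ValueError("need at least 5 points on the segment")
    responses = {}
    for k in range(2, len(profile) - 2):
        responses[k] = (profile[k - 2] + 2 * profile[k - 1]
                        - 2 * profile[k + 1] - profile[k + 2])
    # Bright target: intensity drops outward, so take the maximum response.
    # Dark target: intensity rises outward, so take the minimum response.
    pick = max if bright else min
    k_best = pick(responses, key=responses.get)
    return k_best, responses[k_best]
```

Dropping the absolute value keeps the sign of the intensity change, which is what lets the same operator distinguish a bright target from a dark one.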


Fig. 3. (a) Detected dark leukocyte edge; (b) Detected bright leukocyte edge; (c) Detected
vehicle edge.
3.3. Sample Weighting

To ensure that a sample is inside a bright target, all $e_\theta(k)$, $k = 1, 2, \ldots, 8$, should
be larger than a statically determined threshold. To allow for noise, clutter, and weak image
intensity features, we assume that the sample is inside the target if 7 of the $e_\theta(k)$,
$k = 1, 2, \ldots, 8$, are larger than the threshold. Similarly, if 7 of the $e_\theta(k)$,
$k = 1, 2, \ldots, 8$, are smaller than a threshold, we consider the sample to be inside a dark
target. If a sample is inside a target, its weighting is set to a nonzero number; otherwise, it
is set to zero.
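The inside/outside decision can be written as a simple voting rule. In this sketch we read "7 of" as "at least 7", which is our interpretation. The function name `is_inside_target` and the parameter `min_count` are hypothetical.

```python
def is_inside_target(edge_strengths, threshold, bright=True, min_count=7):
    """Decide whether a sample lies inside the target.

    edge_strengths: the edge responses of the sample, one per orientation
    (8 values in the setting described above).
    """
    if bright:
        votes = sum(1 for e in edge_strengths if e > threshold)
    else:
        votes = sum(1 for e in edge_strengths if e < threshold)
    # Assumption: "7 of 8" is interpreted as "at least min_count".
    return votes >= min_count
```

Tolerating one failing orientation keeps a sample from being rejected when a single radial line is corrupted by noise or clutter.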

For a sample with a nonzero weighting, we define $d_{t+1}^{(m)}$ to measure the displacement
between the position of sample $m$ and $(x_{c,t+1}, y_{c,t+1})$. The importance weight
$\pi_{t+1}^{(m)}$ should be large when the displacement is small. In our algorithm, we define


where $2 < a < 5$. After normalization, the weighting we assign to the sample is

$$\pi_{t+1}^{(m)} = \frac{\pi_{t+1}^{(m)}}{\sum_{m'=1}^{M} \pi_{t+1}^{(m')}}.$$
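The normalization and the weighted average that yields the filtered position can be sketched as follows. The unnormalized weights are taken as given inputs. Returning `None` when no sample has a nonzero weight is our assumption, since the report does not say how that case is handled, and the name `filtered_position` is hypothetical.

```python
def filtered_position(samples, weights):
    """Normalize the sample weights and return the weighted mean position.

    samples: list of (x, y) sample positions.
    weights: unnormalized importance weights (zero for samples that were
    not detected to be inside the target).
    Returns ((x, y), normalized_weights), or None if all weights are zero.
    """
    total = sum(weights)
    if total <= 0:
        return None  # assumption: no sample was found inside the target
    normalized = [w / total for w in weights]
    x = sum(p * sx for p, (sx, _) in zip(normalized, samples))
    y = sum(p * sy for p, (_, sy) in zip(normalized, samples))
    return (x, y), normalized
```

Because samples outside the target carry zero weight, they drop out of the average without any special handling.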


3.4.  Feedback  in  Image  Processing 

The concept of feedback is followed throughout our algorithm development. For example, the
adjustment of the threshold in image intensity measurement and the determination of the density
and number of samples are both based on data from the previous step.

3.5.  Some  Observations 


The success of the tracking algorithm of Cui et al. (2006) is highly dependent on the accuracy
of the predicted position of the target. If the target is a few pixels away from the predicted
position, the tracker will lose the target. The Cui et al. (2006) method also needs the target
positions in the first two frames to initialize the process and ensure that the predicted
position is accurate.

Our  proposed  tracker  can  track  a  target  even  when  it  is  many  pixels  away  from  the  predicted 
position.  Also,  we  need  only  the  target  position  in  the  first  frame  to  begin  the  tracking. 


4.  Comparison  with  Monte  Carlo  Tracker  and  GVF  Snake  Tracker 

We compare the performance of our proposed particle filter with feedback (PFF) tracker with
those of the Monte Carlo (MC) tracker and the GVF snake tracker on 30 sequences. Each
sequence consists of 90 frames. This comparison shows the superior performance of the PFF
tracker in terms of both tracking accuracy and tracking speed. The drastically reduced time
required by the PFF tracker makes real-time tracking possible.

All simulations were carried out in Matlab 7.1.0.246 (R14) on a PC with a 2 GHz Intel Core 2
CPU and 1 GB of RAM. Each tracker is evaluated in the following three aspects.

(1)  Percentage  of  the  frames  tracked. 

(2) Number of sequences (out of 30 sequences) with all frames tracked.

(3) Time taken to process each sequence.
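The first two measures can be computed from the number of frames tracked in each sequence, as in the sketch below. The function name `tracking_metrics` and the input values used in the example are hypothetical and are not results from our experiments. The third measure is a direct timing and is not shown.

```python
def tracking_metrics(frames_tracked, frames_per_sequence=90):
    """Return (percentage of frames tracked, number of sequences in which
    every frame was tracked).

    frames_tracked: number of frames tracked in each sequence.
    """
    total = frames_per_sequence * len(frames_tracked)
    percent = 100.0 * sum(frames_tracked) / total
    complete = sum(1 for n in frames_tracked if n == frames_per_sequence)
    return percent, complete


# Made-up counts for four sequences, for illustration only.
percent, complete = tracking_metrics([90, 45, 90, 0])
```

For these made-up counts, 225 of 360 frames are tracked and two sequences are tracked in full.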

Shown in Fig. 4 is the comparison of the percentage of frames tracked. As seen in the figure,
with registration, the PFF tracker tracks 16% more frames than the MC tracker and 18% more
frames than the GVF snake tracker. Without registration, the performance of the PFF tracker
remains almost the same, while the performances of the MC tracker and the GVF snake tracker
degrade drastically. This indicates that the PFF tracker is not only much more accurate but
also much more robust.


Fig. 4. Percentage of the frames tracked (panels: with registration, without registration).

Shown in Fig. 5 is the number of sequences (out of 30 sequences) with all 90 frames tracked,
both with and without registration. With registration, the PFF tracker is able to track all 90
frames in 23 out of 30 sequences, the MC tracker in 18 sequences, and the GVF snake tracker in
only 14 sequences. Without registration, the number of sequences with all 90 frames tracked by
the PFF tracker remains the same, while this number for the GVF snake tracker decreases to 1/3
and the number for the MC tracker decreases to 1/2.


Fig. 5. Number of sequences (out of 30 sequences) with all 90 frames tracked (panels: with
registration, without registration).

Shown in Fig. 6 are the differences in the numbers of frames tracked by the three trackers for
each of the 30 sequences. The top two graphs show that the MC tracker and the GVF snake tracker
outperform each other in about the same number of sequences and by about the same margin. The
middle two plots show that the PFF tracker drastically outperforms the GVF snake tracker. In
particular, with registration, in only two of the 30 sequences does the GVF snake tracker track
a few more frames than the PFF tracker. On the other hand, there are 8 sequences in which the
PFF tracker tracks over 40 more frames than the GVF snake tracker, and in another two sequences
the PFF tracker tracks 20 more frames than the GVF snake tracker. Without registration, the PFF
tracker outperforms the MC tracker and the GVF snake tracker even more drastically. The bottom
two plots show that the PFF tracker outperforms the MC tracker by a margin similar to that by
which it outperforms the GVF snake tracker.

Shown in Fig. 7 is the average time required for computation in each sequence. In measuring
these times, the time used to read data from the hard disk is not included. As seen in Fig. 7,
the PFF tracker is about 34 times faster than the GVF snake tracker and over 56 times faster
than the MC tracker.

To make the comparison relatively fair, the main time-consuming parts of the algorithms should
all be written in the same programming language. We have chosen C. The main codes of both the
MC tracker and the PFF tracker were written in C. The available GVF snake tracker is in the
form of m-files (gvf.m and movesnake.m), which require an average of 15.2683 seconds to track
a sequence. The most time-consuming code in the GVF snake tracker is the GVF algorithm. When we
replaced gvf.m with a C code, gvf.c, found at http://www.iacl.ece.ihu.edu/resources/, the
required time decreased to 5.24 seconds. We have also rewritten movesnake.m as movesnake.c.
Using both gvf.c and movesnake.c, the required time further decreased to 3.14 seconds (see
Fig. 8). The time for the GVF snake tracker shown in Fig. 7 for comparison is this further
reduced time of 3.14 seconds.
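The timing figures quoted above can be related by simple arithmetic, as in the following sketch. The PFF time per sequence is not stated directly in the text. The value computed here is only implied by the "about 34 times faster" statement and should be read as a rough estimate, not a measurement.

```python
FRAMES_PER_SEQUENCE = 90

t_matlab = 15.2683  # seconds per sequence, gvf.m and movesnake.m
t_gvf_c = 5.24      # seconds per sequence, gvf.c with movesnake.m
t_all_c = 3.14      # seconds per sequence, gvf.c and movesnake.c

speedup_gvf_c = t_matlab / t_gvf_c  # about 2.9
speedup_all_c = t_matlab / t_all_c  # about 4.9

# Implied by "about 34 times faster than the GVF snake tracker";
# an estimate derived from the quoted ratio, not a measured value.
pff_time = t_all_c / 34                                # about 0.092 s
pff_ms_per_frame = 1000.0 * pff_time / FRAMES_PER_SEQUENCE  # about 1.0 ms
```

An estimated cost on the order of one millisecond per frame is consistent with the statement that the PFF tracker makes real-time tracking possible.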


Fig. 6. The differences in numbers of frames tracked by three different trackers for each of
the 30 sequences (panel title: With registration).

Fig. 7. Average time required for tracking in each sequence.

Fig. 8. Average time per sequence required by the GVF snake tracker with different codes.


5.  Multiple  Target  Tracking 

The PFF tracker is also effective in tracking multiple targets. Shown in Fig. 9 is the
performance of the PFF tracker in tracking 8 UAVs in a video sequence. The PFF tracker tracks
all 8 targets. On the other hand, the MC tracker tracks only 6 targets and misses 2 targets.
More specifically, it tracks target #7 in 32 frames out of the 79 frames and target #8 in 49
frames out of 150 frames. The GVF snake tracker performs similarly to the MC tracker. It tracks
6 targets and misses 2 targets (it tracks target #5 in 46 frames out of the 100 frames and
target #8 in 49 frames out of 150 frames). We observe that both the GVF and MC trackers miss
targets when the targets pass under the tree or are close to the roadside.

Shown in Fig. 10 are the tracking speeds of all three trackers, indicating the superior
performance of the PFF tracker.
Fig. 9. The PFF tracker tracks all 8 targets in a 5 second segment of a typical UAV video.

Fig. 10. Time required in tracking.